
    METANNOGEN: compiling features of biochemical reactions needed for the reconstruction of metabolic networks

    BACKGROUND: One central goal of computational systems biology is the mathematical modelling of complex metabolic reaction networks. The first and most time-consuming step in the development of such models is the stoichiometric reconstruction of the network, i.e. the compilation of all metabolites, reactions and transport processes relevant to the considered network and their assignment to the various cellular compartments. An information system is therefore required to collect and manage data from different databases and the scientific literature in order to generate a metabolic network of biochemical reactions that can be subjected to further computational analyses. RESULTS: The computer program METANNOGEN facilitates the reconstruction of metabolic networks. It uses KEGG, the well-known database of biochemical reactions, as its primary information source, from which biochemical reactions relevant to the considered network can be selected, edited and stored in a separate, user-defined database. Reactions not contained in KEGG can be entered manually into the system. To aid the decision whether or not a reaction selected from KEGG belongs to the considered network, METANNOGEN contains information from SWISSPROT and ENSEMBL and provides Web links to a number of important information sources such as METACYC, BRENDA, NIST, and REACTOME. If a reaction is reported to occur in more than one cellular compartment, a corresponding number of reactions is generated, each referring to one specific compartment. Transport processes of metabolites are entered like chemical reactions in which reactants and products have different compartment attributes. The list of compartmentalized biochemical reactions and membrane transport processes compiled by means of METANNOGEN can be exported as an SBML file for further computational analysis. METANNOGEN is highly customizable with respect to the content of the SBML output file, additional data fields, the graphical input form, highlighting of project-specific search terms and dynamically generated Web links. CONCLUSION: METANNOGEN is a flexible tool to manage information for the design of metabolic networks. The program requires Java Runtime Environment 1.4 or higher, about 100 MB of free RAM and about 200 MB of free HD space. It does not require installation and can be directly Java-webstarted from
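The compartment handling described in the abstract above can be illustrated with a minimal Python sketch. It is not METANNOGEN's code (the program itself is Java-based); the `Reaction` class and the functions `expand_by_compartment` and `is_transport` are hypothetical names introduced here, and the metabolite and compartment labels are made-up examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """A reaction whose metabolites carry a compartment attribute (hypothetical model)."""
    name: str
    reactants: tuple  # (metabolite, compartment) pairs
    products: tuple   # (metabolite, compartment) pairs


def expand_by_compartment(name, reactants, products, compartments):
    """Generate one compartment-specific copy of a reaction per listed compartment."""
    return [
        Reaction(
            name=f"{name}_{c}",
            reactants=tuple((m, c) for m in reactants),
            products=tuple((m, c) for m in products),
        )
        for c in compartments
    ]


def is_transport(reaction):
    """Treat a reaction as a transport process when the compartments of its
    reactants differ from the compartments of its products."""
    reactant_side = {c for _, c in reaction.reactants}
    product_side = {c for _, c in reaction.products}
    return reactant_side != product_side
```

For example, a reaction reported in both cytosol and mitochondrion would yield two `Reaction` objects, while a glucose import step written with the reactant in an "extracellular" compartment and the product in "cytosol" would be flagged by `is_transport`.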

    Consistency, comprehensiveness, and compatibility of pathway databases

    BACKGROUND: It is necessary to analyze microarray experiments together with biological information to make better biological inferences. We investigate the adequacy of current biological databases to address this need. DESCRIPTION: Our results show a low level of consistency, comprehensiveness and compatibility among three popular pathway databases (KEGG, Ingenuity and Wikipathways). The level of consistency for genes in similar pathways across databases ranges from 0% to 88%. The corresponding level of consistency for interacting gene pairs is 0% to 61%. These three original sources can be assumed to be reliable, in the sense that the interacting gene pairs reported in them are correct, because they are curated. However, the lack of concordance between these databases suggests that each source has missed many genes and interacting gene pairs. CONCLUSIONS: Researchers will hence find it challenging to obtain consistent pathway information from these diverse data sources. It is therefore critical to enable them to access these sources via a consistent, comprehensive and unified pathway API. We accumulated sufficient data to create such an aggregated resource, with the convenience of an API to access its information. This unified resource can be accessed at http://www.pathwayapi.com.
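The abstract above does not state how consistency was computed. As one plausible reading, the sketch below scores the agreement between two databases for the same pathway as shared entries divided by all entries listed in either source. The function names and the gene symbols in the usage note are illustrative assumptions, not data from the paper.

```python
def overlap_percent(a, b):
    """Percentage of entries shared by two collections, relative to their union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty pathways: define the overlap as zero
    return 100.0 * len(a & b) / len(union)


def unordered_pairs(pairs):
    """Normalise interacting gene pairs so that (A, B) and (B, A) compare equal."""
    return {frozenset(p) for p in pairs}
```

With made-up gene lists such as `{"TP53", "MDM2", "CDKN1A", "BAX"}` and `{"TP53", "MDM2", "ATM"}`, `overlap_percent` gives 40.0, because two of five distinct genes appear in both. Gene pairs can be compared the same way after passing each list through `unordered_pairs`.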

    phorest: a web-based tool for comparative analyses of expressed sequence tag data

    Comparative analysis of expressed sequence tags is becoming an important tool in molecular ecology for comparing gene expression in organisms grown in certain environments. Additionally, expressed sequence tag database information can be used for the construction of DNA microarrays and for the detection of single nucleotide polymorphisms. For such applications, we present PHOREST, a web-based tool for managing, analysing and comparing various collections of expressed sequence tags. It is written in PHP (PHP: Hypertext Preprocessor) and runs on UNIX, Microsoft Windows and Macintosh (Mac OS X) platforms

    A survey of orphan enzyme activities

    BACKGROUND: Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such "orphan" activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature. RESULTS: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles. CONCLUSION: This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.

    Multi-membership gene regulation in pathway based microarray analysis

    This article is available through the Brunel Open Access Publishing Fund. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology for pathway-based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims to establish the state of individual pathways by identifying those truly affected by the experimental conditions, based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm, and analyse the consistency of the produced results using fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent gene-to-pathway allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes that are members of a number of biochemical pathways or modules are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. The work was sponsored by the studentship scheme of the School of Information Systems, Computing and Mathematics, Brunel University

    ORENZA: a web resource for studying ORphan ENZyme activities

    BACKGROUND: Despite the current availability of several hundreds of thousands of amino acid sequences, more than 36% of the enzyme activities (EC numbers) defined by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) are not associated with any amino acid sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found for nearly all classes of enzymes. Thus, there is an urgent need to explore these sequence-less EC numbers, in order to progressively close this gap. DESCRIPTION: We designed ORENZA, a PostgreSQL database of ORphan ENZyme Activities, to collate information about the EC numbers defined by the NC-IUBMB with specific emphasis on orphan enzyme activities. Complete lists of all EC numbers and of orphan EC numbers are available and will be periodically updated. ORENZA allows one to browse the complete list of EC numbers or the subset associated with orphan enzymes or to query a specific EC number, an enzyme name or a species name for those interested in particular organisms. It is possible to search ORENZA for the different biochemical properties of the defined enzymes, the metabolic pathways in which they participate, the taxonomic data of the organisms whose genomes encode them, and many other features. The association of an enzyme activity with an amino acid sequence is clearly underlined, making it easy to identify at once the orphan enzyme activities. Interactive publishing of suggestions by the community would provide expert evidence for re-annotation of orphan EC numbers in public databases. CONCLUSION: ORENZA is a Web resource designed to progressively bridge the unwanted gap between function (enzyme activities) and sequence (dataset present in public databases). ORENZA should increase interactions between communities of biochemists and of genomicists. This is expected to reduce the number of orphan enzyme activities by allocating gene sequences to the relevant enzymes

    Use of reconstituted metabolic networks to assist in metabolomic data visualization and mining

    Metabolomics experiments seldom achieve their aim of comprehensively covering the entire metabolome. However, important information can be gleaned even from sparse datasets, which can be facilitated by placing the results within the context of known metabolic networks. Here we present a method that allows the automatic assignment of identified metabolites to positions within known metabolic networks, and, furthermore, allows automated extraction of sub-networks of biological significance. This latter feature is possible by use of a gap-filling algorithm. The utility of the algorithm in reconstructing and mining of metabolomics data is shown on two independent datasets generated with LC–MS LTQ-Orbitrap mass spectrometry. Biologically relevant metabolic sub-networks were extracted from both datasets. Moreover, a number of metabolites, whose presence eluded automatic selection within mass spectra, could be identified retrospectively by virtue of their inferred presence through gap filling
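The abstract above does not specify the gap-filling algorithm. The sketch below shows one simple way the idea could work, under stated assumptions: the metabolic network is an adjacency dictionary, detected metabolites are linked by breadth-first shortest paths, and undetected intermediates on short connecting paths are reported as inferred. The function names and the `max_gap` parameter are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                if neighbour == goal:
                    # Walk back from goal to start, then reverse.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None


def gap_fill(graph, detected, max_gap=2):
    """Return undetected metabolites lying on short paths between detected ones.

    A path contributes only if it contains between 1 and max_gap undetected
    intermediates (max_gap is an assumed tuning parameter).
    """
    detected = set(detected)
    inferred = set()
    ordered = sorted(detected)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            path = shortest_path(graph, a, b)
            if path is None:
                continue
            gaps = [m for m in path[1:-1] if m not in detected]
            if 0 < len(gaps) <= max_gap:
                inferred.update(gaps)
    return inferred
```

On a linear chain glucose, G6P, F6P, F16BP in which only glucose and F6P were detected, `gap_fill` would report G6P as an inferred intermediate, which mirrors the retrospective identification the abstract describes.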

    RDFScape: Semantic Web meets Systems Biology

    BACKGROUND: The recent availability of high-throughput data in molecular biology has increased the need for a formal representation of this knowledge domain. New ontologies are being developed to formalize knowledge, e.g. about the functions of proteins. As the Semantic Web is being introduced into the Life Sciences, the basis for a distributed knowledge-base that can foster biological data analysis is laid. However, there still is a dichotomy, in tools and methodologies, between the use of ontologies in biological investigation, that is, in relation to experimental observations, and their use as a knowledge-base. RESULTS: RDFScape is a plugin that has been developed to extend a software oriented to biological analysis with support for reasoning on ontologies in the semantic web framework. We show with this plugin how the use of ontological knowledge in biological analysis can be extended through the use of inference. In particular, we present two examples relative to ontologies representing biological pathways: we demonstrate how these can be abstracted and visualized as interaction networks, and how reasoning on causal dependencies within elements of pathways can be implemented. CONCLUSIONS: The use of ontologies for the interpretation of high-throughput biological data can be improved through the use of inference. This allows the use of ontologies not only as annotations, but as a knowledge-base from which new information relevant for specific analysis can be derived.

    Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets

    Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful and widely applied method for the study of biological systems, biomarker discovery and pharmacological interventions. LC-MS measurements are, however, significantly complicated by several technical challenges, including: (1) ionisation suppression/enhancement, disturbing the correct quantification of analytes, and (2) the detection of large amounts of separate derivative ions, increasing the complexity of the spectra, but not their information content. Here we introduce an experimental and analytical strategy that leads to robust metabolome profiles in the face of these challenges. Our method is based on rigorous filtering of the measured signals based on a series of sample dilutions. Such data sets have the additional characteristic that they allow a more robust assessment of detection signal quality for each metabolite. Using our method, almost 80% of the recorded signals can be discarded as uninformative, while important information is retained. As a consequence, we obtain a broader understanding of the information content of our analyses and a better assessment of the metabolites detected in the analyzed data sets. We illustrate the applicability of this method using standard mixtures, as well as cell extracts from bacterial samples. It is evident that this method can be applied in many types of LC-MS analyses and more specifically in untargeted metabolomics